Modelling and visualizing data

You will learn how to:

1 Interactive visualization: The core of Shiny

  • Shiny offers the perfect basis for visualization
    • Plots can be modified using UI inputs
    • Seamless integration of interactivity elements (e.g. pan, zoom)
    • Dashboards facilitate the idea of story-telling by providing context to plots

1.1 Good practice examples

  • Examples of these concepts can be seen in many Shiny apps, one example is Edward Parker’s COVID-19 tracker
Note

Exercise

Explore the COVID-19 tracker. Do you think this is a good Shiny app? If so, why? If not, why not?

COVID-19 Tracker

1.2 Plain plotting vs. Shiny

Feature Plain R Shiny Examples
Reactivity Changes in the visualization have to be changed in the code Visualizations can be modified on the fly using widgets like drop-down menus ExPanD
Interactivity Plots are static raster or vector images Plots can be dynamic and can be interacted with COVID-19 tracker
Narrativity Sense-making happens through manual annotation, e.g. in an article or a presentation Plots are embedded in a compilation of narrative elements that can tell a coherent story

Freedom of Press Shiny app

GRETA Analytics

Medium Reactivity Interactivity Narrativity
Plain image
Paper / report
Dashboard (e.g. Tableau) ☑️
Quarto / RMarkdown ☑️
Traditional website ☑️
Shiny

1.3 Current app state

  • In the last sections, we added a table and a plot and linked them to a number of inputs
  • The code chunk below contains the current app state
  • In this section, we will:
    • Augment the violin plot
    • Add an interactive map
Full code for the current app state
library(dplyr)
library(tidyr)
library(shiny)
library(plotly)
library(leaflet)
library(haven)

ess <- readRDS("../../../data/ess_trust.rds")
ess_geo <- readRDS("../../../data/ess_trust_geo.rds")

# UI ----
ui <- fluidPage(
  titlePanel("European Social Survey - round 10"),
  
  ## Sidebar ----
  sidebarLayout(
    sidebarPanel(
      ### select dependent variable
      selectInput(
        "xvar",
        label = "Select a dependent variable",
        choices = c(
          "Trust in country's parliament" = "trust_parliament",
          "Trust in the legal system" = "trust_legal",
          "Trust in the police" = "trust_police",
          "Trust in politicians" = "trust_politicians",
          "Trust in political parties" = "trust_parties",
          "Trust in the European Parliament" = "trust_eu",
          "Trust in the United Nations" = "trust_un"
        )
      ),
      
      ### select a variable ----
      selectInput(
        "yvar",
        label = "Select an independent variable",
        choices = c(
          "Placement on the left-right scale" = "left_right",
          "Age" = "age",
          "Feeling about household's income" = "income_feeling",
          "How often do you use the internet?" = "internet_use",
          "How happy are you?" = "happiness"
        )
      ),
      
      ### select a country ----
      selectizeInput(
        "countries",
        label = "Filter by country",
        choices = unique(ess$country),
        selected = "FR",
        multiple = TRUE
      ),
      
      ### filter values ----
      sliderInput(
        "range",
        label = "Set a value range",
        min = min(ess$trust_parliament, na.rm = TRUE),
        max = max(ess$trust_parliament, na.rm = TRUE),
        value = range(ess$trust_parliament, na.rm = TRUE),
        step = 1
      )
    ),
    
    ## Main panel ----
    mainPanel(
      tabsetPanel(
        type = "tabs",
        
        ### Table tab ----
        tabPanel(
          title = "Table",
          div(
            style = "height: 600px; overflow-y: auto;",
            tableOutput("table")
          )
        ),
        
        ### Plot tab ----
        tabPanel(
          title = "Histogram",
          plotOutput("plot", height = 600)
        ),
        
        ### Map tab ----
        tabPanel(
          title = "Map",
          leafletOutput("map", height = 600)
        )
      )
    )
  )
)


# Server ----
server <- function(input, output, session) {
  # update slider ----
  observe({
    var <- na.omit(ess[[input$xvar]])
    is_ordered <- is.ordered(var)
    var <- as.numeric(var)
    updateSliderInput(
      inputId = "range",
      min = min(var),
      max = max(var),
      value = range(var),
      step = if (is_ordered) 1
    )
  }) %>%
    bindEvent(input$xvar)
  
  # filter data ----
  filtered <- reactive({
    req(input$countries, cancelOutput = TRUE)
    
    xvar <- input$xvar
    yvar <- input$yvar
    range <- input$range
    
    # select country
    ess <- ess[ess$country %in% input$countries, ]
    
    # select variable
    ess <- ess[c("idno", "country", xvar, yvar)]
    
    # apply range
    ess <- ess[ess[[xvar]] > range[1] & ess[[xvar]] < range[2], ]
    ess
  })
  
  # render table ----
  output$table <- renderTable({
    filtered()
  }, height = 400)
  
  # render plot ----
  output$plot <- renderPlot({
    xvar <- input$xvar
    yvar <- input$yvar
    plot_data <- filtered() %>%
      drop_na() %>%
      mutate(across(where(is.numeric), .fns = as.ordered))
    
    ggplot(plot_data) +
      aes(x = .data[[xvar]], y = .data[[yvar]], group = .data[[xvar]]) +
      geom_violin(fill = "lightblue", show.legend = FALSE) +
      theme_classic()
  })
}

shinyApp(ui = ui, server = server)

1.4 Recap: Plotting in Shiny

  • Inserting plots in Shiny apps works just like any other UI component
  • You need two things: plotOutput() (or similar) in the UI and renderPlot() (or similar) in the server function
    • plotOutput() creates the empty element in the UI where the plot will go
    • renderPlot() renders the plot and updates the UI element every time a reactive dependency is invalidated

2 Interactivity

  • R itself is very bad at interactivity
  • Shiny supports some very essential interactivity through plotOutput
    • Not covered in this workshop! For a primer, check out chapter 7.1 of Mastering Shiny
  • All of today’s cool kids use interactivity through Javascript interfaces

3 Plotly

Plotly is an open-source library to create charts that can be interacted with in various way

It supports several languages including R and Python

Plotly is arguably the most renowned R package for interactive plotting

It even motivated an entire book: https://plotly-r.com/

3.1 Plotly’s grammar of graphics

  • Similar to ggplot2, R plotly defines its own grammar of graphics
  • A plotly canvas is created with plot_ly()
  • Additional plot elements can be added through pipes %>% or |>
mtcars$am[which(mtcars$am == 0)] <- 'Automatic'
mtcars$am[which(mtcars$am == 1)] <- 'Manual'
mtcars$am <- as.factor(mtcars$am)


plot_ly(
  mtcars,
  x = ~wt,
  y = ~hp,
  z = ~qsec,
  color = ~am,
  colors = c('#BF382A', '#0C4B8E')
) %>%
  add_markers() %>%
  layout(scene = list(
    xaxis = list(title = 'Weight'),
    yaxis = list(title = 'Gross horsepower'),
    zaxis = list(title = '1/4 mile time')
  ))
1
Variables such as x, y, z and color are defined as formulas in a call to plot_ly. This is comparable to calling ggplot(aes(x, y, z, color)).
2
The plot type is added through a pipe. This is comparable to ggplot2 functions such as geom_point or geom_bar.
3
Visual sugar is then added by calling layout and manually editing the axis titles.

3.2 Quick and dirty interactivity

  • One important advantage of plotly is that you do not need to learn its grammar
  • ggplot2 plots can very easily be converted to an interactive plotly plot:
p <- ggplot(iris) +
  geom_point(aes(Sepal.Width, Sepal.Length))
ggplotly(p)

3.3 Extending plotly

3.3.1 Customization

  • We can extend Plotly objects using three functions:
    • layout() changes the plot organisation (think ggplot2::theme()), e.g.:
      • colors, sizes, fonts, positions, titles, ratios and alignment of all kinds of plot elements
      • updatemenus adds buttons or drop down menus that can change the plot style or layout (see here for examples)
      • sliders adds sliders that can be useful for time series (see here for examples)
    • config() changes interactivity configurations, e.g.:
      • The modeBarButtons options and displaylogo control the buttons in the mode bar
      • toImageButtonOptions controls the format of plot downloads
      • scrollZoom enables or disables zooming by scrolling
    • style() changes data-level attributes (think ggplot2::scale_), e.g.:
      • hoverinfo controls whether tooltips are shown on hover
      • mode controls whether to show points, lines and/or text in a scatter plot
      • hovertext modifies the tooltips texts shown on hover

3.3.2 Schema

  • The actual number of options is immense!
  • You can explore all options by calling plotly::schema()
schema()

3.3.3 Example

p <- ggplot(iris) +
  geom_point(aes(Sepal.Width, Sepal.Length))

ggplotly(p) %>%
  config(
    modeBarButtonsToRemove = c(
      "sendDataToCloud", "zoom2d", "select2d", "lasso2d", "autoScale2d",
      "hoverClosestCartesian", "hoverCompareCartesian", "resetScale2d"
    ),

    displaylogo = FALSE,

    toImageButtonOptions = list(
      format = "svg",
      filename = "plot",
      height = NULL,
      width = NULL
    ),

    scrollZoom = TRUE
  )
1
Removes specified buttons from the modebar.
2
Removes the Plotly logo.
3
Changes the output of snapshots taken of the plot. Setting height and width to NULL keeps the aspect ratio of the plot as it is shown in the app.
4
Changes the output of snapshots taken of the plot. Setting height and width to NULL keeps the aspect ratio of the plot as it is shown in the app.

3.4 Plotly and Shiny

  • Since plotly does not produce static plots like ggplot2, it cannot be served by plotOutput and renderPlot
  • Plotly defines two new functions:
    • plotlyOutput on the UI side
    • renderPlotly on the server side

UI:

mainPanel(
  tabsetPanel(
    type = "tabs",
    
    ### Table tab ----
    tabPanel(
      title = "Table",
      div(
        style = "height: 600px; overflow-y: auto;",
        tableOutput("table")
      )
    ),
    
    ### Plot tab ----
    tabPanel(
      title = "Histogram",
      plotlyOutput("plot", height = 600)
    )
  )
)

Server:

output$plot <- renderPlotly({
  xvar <- input$xvar
  yvar <- input$yvar
  plot_data <- filtered() %>%
    drop_na() %>%
    mutate(across(where(is.numeric), .fns = as.ordered))
  
  p <- ggplot(plot_data) +
    aes(x = .data[[xvar]], y = .data[[yvar]], group = .data[[xvar]]) +
    geom_violin(fill = "lightblue", show.legend = FALSE) +
    theme_classic()
  ggplotly(p)
})
Complete code (important lines are highlighted)
library(dplyr)
library(tidyr)
library(shiny)
library(plotly)
library(leaflet)
library(haven)

ess <- readRDS("../../../data/ess_trust.rds")
ess_geo <- readRDS("../../../data/ess_trust_geo.rds")

# UI ----
ui <- fluidPage(
  titlePanel("European Social Survey - round 10"),
  
  ## Sidebar ----
  sidebarLayout(
    sidebarPanel(
      ### select dependent variable
      selectInput(
        "xvar",
        label = "Select a dependent variable",
        choices = c(
          "Trust in country's parliament" = "trust_parliament",
          "Trust in the legal system" = "trust_legal",
          "Trust in the police" = "trust_police",
          "Trust in politicians" = "trust_politicians",
          "Trust in political parties" = "trust_parties",
          "Trust in the European Parliament" = "trust_eu",
          "Trust in the United Nations" = "trust_un"
        )
      ),
      
      ### select a variable ----
      selectInput(
        "yvar",
        label = "Select an independent variable",
        choices = c(
          "Placement on the left-right scale" = "left_right",
          "Age" = "age",
          "Feeling about household's income" = "income_feeling",
          "How often do you use the internet?" = "internet_use",
          "How happy are you?" = "happiness"
        )
      ),
      
      ### select a country ----
      selectizeInput(
        "countries",
        label = "Filter by country",
        choices = unique(ess$country),
        selected = "FR",
        multiple = TRUE
      ),
      
      ### filter values ----
      sliderInput(
        "range",
        label = "Set a value range",
        min = min(ess$trust_parliament, na.rm = TRUE),
        max = max(ess$trust_parliament, na.rm = TRUE),
        value = range(ess$trust_parliament, na.rm = TRUE),
        step = 1
      )
    ),
    
    ## Main panel ----
    mainPanel(
      tabsetPanel(
        type = "tabs",
        
        ### Table tab ----
        tabPanel(
          title = "Table",
          div(
            style = "height: 600px; overflow-y: auto;",
            tableOutput("table")
          )
        ),
        
        ### Plot tab ----
        tabPanel(
          title = "Histogram",
          plotlyOutput("plot", height = 600)
        ),
        
        ### Map tab ----
        tabPanel(
          title = "Map",
          leafletOutput("map", height = 600)
        )
      )
    )
  )
)


# Server ----
server <- function(input, output, session) {
  # update slider ----
  observe({
    var <- na.omit(ess[[input$xvar]])
    is_ordered <- is.ordered(var)
    var <- as.numeric(var)
    updateSliderInput(
      inputId = "range",
      min = min(var),
      max = max(var),
      value = range(var),
      step = if (is_ordered) 1
    )
  }) %>%
    bindEvent(input$xvar)
  
  # filter data ----
  filtered <- reactive({
    req(input$countries, cancelOutput = TRUE)
    
    xvar <- input$xvar
    yvar <- input$yvar
    range <- input$range
    
    # select country
    ess <- ess[ess$country %in% input$countries, ]
    
    # select variable
    ess <- ess[c("idno", "country", xvar, yvar)]
    
    # apply range
    ess <- ess[ess[[xvar]] > range[1] & ess[[xvar]] < range[2], ]
    ess
  })
  
  # render table ----
  output$table <- renderTable({
    filtered()
  }, height = 400)
  
  # render plot ----
  output$plot <- renderPlotly({
    xvar <- input$xvar
    yvar <- input$yvar
    plot_data <- filtered() %>%
      drop_na() %>%
      mutate(across(where(is.numeric), .fns = as.ordered))
    
    p <- ggplot(plot_data) +
      aes(x = .data[[xvar]], y = .data[[yvar]], group = .data[[xvar]]) +
      geom_violin(fill = "lightblue", show.legend = FALSE) +
      theme_classic()
    ggplotly(p)
  })
}

shinyApp(ui = ui, server = server)

Exercises

Note

Exercise 1

Thinking back to our initial visualization structure (data selection, data exploration, data modelling, ???), what could be a good last step? What type of visualization can enhance our understanding of the relationship among the Guerry variables? Write down your ideas along with possible types of visualizations.

Visit the R graph gallery or the R Plotly gallery to get inspired!

Note

Exercise 2

Add a fourth quadrant in the dashboard layout. Also add a box that will hold the content.

Don’t forget about the use of fluidRow() and column()! The new quadrant goes right below the pair plot in the second column:

fluidRow(
    column(
        width = 6,
        box(width = 12), # quadrant 1
        box(width = 12) # quadrant 3
    ),
    column(
        width = 6,
        box(width = 12), # quadrant 2
        box(width = 12) # quadrant 4
    )
)

Solution

The fourth quadrant is the second row of the second column, i.e.:

fluidRow(
    column(
        width = 6,
        box(width = 12), # quadrant 1
        box(width = 12) # quadrant 3
    ),
    column(
        width = 6,
        box(width = 12), # quadrant 2
        box(width = 12, status = "primary", plotOutput("new_plot")) # quadrant 4
    )
)
Note

Exercise 3

Add a UI output and an empty rendering function

The respective plotly functions are plotly::plotlyOutput() and plotly::renderPlotly()

Note

Exercise 4

Implement the visualization from exercise 1 within the new box from exercise 2. Create your plot using ggplot2 and convert it to a plotly chart using ggplotly()

Note

Exercise 5

Remove all mode bar buttons except “Zoom in” and “Zoom out” from the new visualization of exercise 4

The relevant function is plotly::config()

Call schema() and explore object -> config to find out about ways to remove mode bar buttons

A list of modebar buttons is provided on Plotly’s GitHub repository or under object -> layout -> layoutAttributes -> modebar -> remove

Solution

To remove modebar buttons, we need to change the plotly::config() of the generated plot output:

ggplotly(p) %>%
  config(modeBarButtonsToRemove = c(
    "sendDataToCloud", "zoom2d", "select2d", "lasso2d", "autoScale2d", "toimage",
    "hoverClosestCartesian", "hoverCompareCartesian", "resetScale2d", "pan"
))
Note

Exercise 6

Change the axis width of the new graph from exercise 4 to 5 pixels and color to #000

The relevant function is plotly::layout()

Call schema() and explore object -> layout -> layoutAttributes to find out about ways to change the axis layout

Solution

To change the axis width, we need to change the plotly::layout() of the plotly object. Determining which option controls the axis layout is a tricky question. To do that, we can explore the plotly::schema(). In this case, the relevant option is found unter object -> layout -> layoutAttributes -> xaxis/yaxis -> linewidth/linecolor. Then, just add a layout to the plot object and change the relevant options:

ggplotly(p) %>%
    layout(
      xaxis = list(linewidth = 5, linecolor = "#000"),
      yaxis = list(linewidth = 5, linecolor = "#000")
    )
Note

Exercise 7

Currently, we have three input widgets to change the appearance of plots: model_x, model_y, and model_std. Implement another input widget that allows users to manipulate the data, output or the plot appearance.

Should the new input widget change all plots or just a selection of plots? Should the new widgets control the way data is cleaned (e.g. normalising), analysed (e.g. different modelling approaches) or displayed (e.g. plot theming)?